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i  j  RESEARCH  OVERVIEW 

The  objective  of  this  contract  is  to  do  research  supporting  the  development  of  a  computer-aided 
fabrication  system  that  will  serve  the  needs  of  the  American  semiconductor  industry  and  the  VLSI  design 
community.  The  semiconductor  industry  requires  computer  tools  to  enhance  yields  and  reduce  cost.  VLSI 
designers  can  benefit  from  the  ability  to  specify  application-specific  processes  to  fabricate  their  application- 
specific  integrated  circuits. 


In  support  of  these  goals,  specific  projects  are  in  place  to  design,  develop,  and  implement  a 
hardware/software  CAF  system  architecture;  to  develop  aids  to  specifying  process  flows  in  a  modular  way;  to 
develop  models  for  semiconductor  fabrication  equipment;  to  develop  technology  GM&  (TCAD)  tools  for  the 


mechanical  properties  of  integrated  structures;  and  to  develop  and  implement  algoril 
in  scheduling  of  fabrication  operations.  /zTE?  L  £ 


Progress  in  each  of  these  areas 


(^k 

reas  is  described  be)c 


and  programs  to  assist 
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CAF  SYSTEM  STRUCTURE 

We  made  substantial  progress  in  the  development  of  programs  relating  to  our  data  model  and  schema. 
Our  Gestalt  system  architecture  provides  a  uniform  query  interface  to  data  residing  in  multiple  autonomous, 
heterogeneous  data  bases.  The  Gestalt  data  base  interface  routines  were  expanded  to  include  lisp  interface 
routines  in  addition  to  C  interface  routines  by  Ken  IshiL 

Mike  Ruf  has  completed  his  S.M.  thesis  proposal,  “Management  of  IC  Manufacturing  Data,”  August  18, 
1988.  Mike  worked  on  our  CAFE  project  this  past  spring  and  is  now  at  TI  on  his  VI-A  internship  company 
assignment.  While  his  work  is  actually  being  done  at  TL,  I  am  his  MFT  S.M.  thesis  supervisor.  This  thesis  is  an 
example  of  how  our  CAFE  system  data  base  philosophy  is  influencing  related  work  at  TI.  A  comprehensive  set 
of  data  base  tools,  DBTOOLS,  were  generated  by  Ruf.  These  include: 

•  dbinspect  •  which  is  a  “data  base  walker”  program  which  enables  application  programmers  (and 
others)  to  find  their  way  around  the  existing  data  base.  It  displays  an  existing  entity  and  allows  the  user  to 
explore  related  entities.  For  example,  one  can  display  a  facility  and  see  that  it  has  a  list  of  machines.  From 
there  one  can  display  a  particular  machine  and  see  its  attributes,  etc. 

•  dbcreate  -  which  enables  the  creation  of  entities  without  writing  a  special  application  program. 

•  dbqucry  -  which  can  be  used  to  implement  data  base  queries. 

•  dbmutate  -  to  enable  changes  to  be  made  to  existing  entities. 

•  dbchoose  -  to  select  entities  from  the  data  base. 

An  S3,  thesis,  “Schema  Viewer.  A  Graphical  Representation  to  Portray  the  Database  Schema  of  the 
MIT  CAFE  System,"  was  completed  by  Nazhin  S.  Zarghamee  in  May,  1988.  This  software  provides  a  schema 
display  so  as  to  allow  lab  managers  and  users  to  provide  more  meaningful  feedback  as  to  the  appropriateness 
and  utility  of  our  schema. 

A  number  of  application  programs  pertaining  to  status  and  log  reports  were  written  or  modified  by  D.  E. 
Troxel.  These  included  change-status  which  is  used  for  entering  data  relating  to  status  changes,  describe- 
machine  to  display  current  status,  new-machine  to  enter  a  new  machine  into  the  data  base,  format-equipment  to 
report  the  status  of  all  equipment  in  a  facility,  up  which  is  a  generalized  status  and  log  report  generator,  and  dbt 
which  was  used  to  transfer  file  based  data  to  the  data  base.  The  up  program  can  produce  graphs  of  uptime  for  a 
selected  machine  for  a  selected  period  of  time  or,  alternatively,  a  summary  of  the  relevant  log  entries  relating  to 
machine  status  changes.  The  graphs  can  be  output  on  a  terminal  or  laser  printer  via  giraphe3  which  was  written 
by  Duane  Boning  and  Bob  Harris. 

A  new  program,  operate-machine,  was  written  by  Mike  Mdlrath.  This  is  initially  used  to  make  log 
entries  when  a  machine  is  operated.  It  also  provides  a  base  for  the  creation  of  machine  specific  data  entry  and 
wtii  be  expanded  substantially  in  the  near  future.  In  addition  to  being  available  from  the  CAFE  menu,  this 
program  or  procedure  will  be  called  by  the  fabrication  interpreter  in  order  to  actually  effect  the  machine 
operations.  The  operate-machine  program  is  being  expanded  to  create  wip  entities  when  the  program  is  exited 
without  the  finish  time  being  specified.  When  this  is  accomplished  several  programs  (up,  change-status,  and 
describe-machine)  will  have  to  be  modified  to  include  reports  on  active  wips. 

The  next  stage  in  the  development  of  operate-machine  is  to  make  it  specific  to  the  actual  machine  being 
operated.  This  requires  generation  of  opdesc  entities  which  embody  the  description  of  the  parameters  or 
machine  settings  and  the  measurements  or  data  which  are  to  be  collected. 

A  new  program  which  interfaces  with  the  Nanospec  film  thickness  instrument  has  been  developed.  In 
addition  to  operating  the  Nanospec  via  the  computer,  it  now  initiates  measurements  and  places  the  resulting 
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thickness  measurement  in  the  appropriate  field  automatically,  thereby  reducing  the  operator  interaction 
required  to  capture  the  measurement  data  for  data  base  storage. 

A  program  written  by  Peter  Monta  provides  for  an  alternative  direct  interface  between  the  Gyrex  mask 
maker  and  a  computer  as  opposed  to  requiring  users  to  write  data  to  magnetic  tape  and  transport  these  tapes  to 
the  Gyrex. 

We  have  developed  and  demonstrated  a  “hands  off  terminal.”  We  chose  a  commercially  available  TI 
speech  recognition  card  which  plugs  into  an  IBM  PC/XT.  Software  has  been  developed  by  Peter  Monta  to 
interface  the  TI  PC  software  to  control  a  fabform  interface.  As  this  speech  recognition  module  is  speaker 
dependent,  the  software  automatically  loads  the  data  base  appropriate  to  the  login  name.  Several  of  us  have 
“trained”  the  recognition  software  and  the  results  are  quite  interesting.  This  “hands  off  terminal”  is  now  ready 
to  be  installed  as  the  terminal  next  to  the  Nanospec  in  the  fabrication  laboratory. 

We  have  continued  the  development  of  a  process  flow  language  (PFL).  The  creation  of  a  PFL  and 
associated  interpreters  is  the  key  to  our  approach  for  generating  actual  fabrication  instructions  and  for 
collecting  the  data  resulting  from  actual  fabrication  steps.  The  interpreters  provide  the  actual  meaning  of  the 
process  flows  expressed  in  the  flow  language. 

Our  previous  PFL  development  was  based  on  only  the  machine  setting  view  in  order  to  get  something 
working  as  soon  as  possible.  We  now  have  a  version  of  our  PFL  which  is  based  on  the  two  stage  process  step 
model  which  relates  the  goal  of  a  change  in  wafer  state  first  to  the  physical  treatment  parameters  and  finally  to 
the  actual  machine  settings  used  to  process  the  wafers.  We  have  recoded  the  CMOS  baseline  process  in  this 
new  version  and,  in  addition,  have  encoded  a  furnace  monitor  process  which  is  routinely  used  every  week. 

We  have  made  substantial  progress  on  an  expert  PFL  editor.  An  expert  PFL  editor  has  been  completed 
by  Rajeev  Jayavant  as  his  S.M.  Thesis,  “An  Intelligent  Process  Flow  Language  Editor”.  This  editor  uses 
fabform  as  the  user  interface.  Ideally  one  starts  with  an  existing  process  flow,  encoded  in  out  lisp  like  syntax, 
which  is  somewhat  similar  to  the  desired  process  flow.  The  editor  then  displays  this  existing  process  flow  with  a 
forms  based  presentation  and  allows  the  user  to  modify  the  flow.  The  editor  then  produces  the  new  flow 
encoded  in  our  lisp  like  PFL  without  the  user  even  being  aware  of  the  lisp  nature  of  the  PFL.  The  editor 
supports  the  three  views  required  by  the  two  stage  generic  process  model  and,  in  addition,  allows  any  number  of 
hierarchical  levels  of  process  flow  definition. 

Our  standard  user  interface,  fabform,  has  been  improved  and  extended  so  that  it  can  now  be  called  as  a 
procedure  from  either  C  or  Lisp. 

We  have  made  substantial  progress  on  the  development  of  a  simulation  interpreter.  Duane  Boning  has 
completed  his  Ph.D.  proposal,  “Custom  Fabrication  Process  Design:  Tools  and  Methodologies,”  March  8, 1988. 
We  have  also  realized  that  we  must  provide  for  operation  of  partial  flows.  One  impediment  to  the  use  of  our 
PFL  is  that  users  change  their  minds  about  the  process  specification  as  they  progress  with  the  actual  fabrication. 
By  concatenating  the  processing  history  of  fabrication  with  a  number  of  partial  flows  we  at  least  will  have  a 
trace  which  accurately  reflects  what  happened.  A  prototype  version  of  the  Suprem-III  Simulation  Interpreter 
has  been  completed.  The  interpreter  generates  Suprem-III  fragments  for  multiple  one-dimensional  cross 
sections,  produces  a  Makefile  to  minimize  shared  simulations,  and  provides  analysis  (plotting,  sheet  resistance, 
and  threshold  voltage)  capabilities.  Duane  Boning  is  now  working  to  get  an  accurate  description  of  the  CMOS 
Baseline  process,  so  that  meaningful  comparisons  of  fabrication  and  simulation  of  the  MIT  standard  defect 
array  is  possible,  and  so  that  realistic  work  with  the  flow  language  can  progress. 

We  completed  a  substantial  reorganization  and  cleanup  of  our  CAFE  software.  We  now  have  it  all 
existing  under  a  single  directory,  /usr/cafe,  in  preparation  for  distribution  to  our  fabrication  laboratory  and 
perhaps  elsewhere. 
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We  have  acquired  another  computer  and  several  workstations.  We  installed  a  VAX  750,  garcon,  for  use 
as  a  file  server.  We  have  installed  three  monochrome  and  two  color  VS2000s,  two  Symbolics  Lisp  machines, 
and  relocated  the  TT  Explorer.  We  have  installed  a  terminal  concentrator  in  building  13.  All  of  our  computers 
are  now  on  the  same  subnet  with  the  result  that  file  transfer  and  NFS  services  are  now  robust. 

This  summer  we  made  a  major  new  release  of  our  CAFE  software.  We  installed  INGRES  5.1,  an 
updated  schema,  operate-machine,  and  a  series  of  applications  programs  related  to  status  and  log  reports. 
These  programs  were  described  in  previous  progress  reports.  In  addition,  Mike  Heytens  generated  a  number  of 
new  database  access  routines  which  provide  for  more  efficient  filtering  and  sorting  by  implementing  these 
operations  at  the  underlying  data  base  level  instead  of  at  the  applications  programming  level  In  spite  of  these 
optimizations  we  found  that  the  system  response  time  had  deteriorated  substantially  both  due  to  increased 
program  size  and  an  increase  in  the  number  of  active  users. 

We  temporarily  took  the  computer  cafljnit.edu  used  for  CAFE  development  out  of  service  in  order  to 
provide  better  response  time  on  caf2.mit.edu,  which  is  the  computer  used  by  fabrication  laboratory  users.  This 
doubled  the  memory  available  on  caf2  from  8  to  16  megabytes.  Later,  we  were  able  to  borrow  some  memory 
and  we  restored  cafl  to  service  and  now  have  16  megabytes  of  main  memory  for  ca£2.  We  conducted  timing 
tests  on  ca£2  with  16, 24,  and  32  megabytes  of  memory.  Seven  identical  CAFE  operations  were  started  in  quick 
succession.  One  sample  operation  implemented  in  Lisp  took  160  seconds  with  16  megabytes  and  8  seconds  with 
24  or  32  megabytes.  A  different  operation  implemented  in  C  took  125  seconds  with  16  megabytes  and  85  to  90 
seconds  with  24  or  32  megabytes.  The  primary  reason  that  the  Lisp  program  is  slower  than  the  C  program  is 
that  it  is  much  larger  and  the  paging  time  is  thus  longer.  Clearly  we  need  to  acquire  more  memory. 

A  fast  method  of  logging  in  and  out  of  the  laboratory  has  been  implemented  by  Rajeev  Jayavant.  The 
actual  computations  involved  is  not  any  faster  but  the  user  no  longer  must  wait  until  they  finish.  This  has  been 
tested  but  not  yet  installed  on  caf2. 

An  extension  to  the  reservation  program  has  been  made  and  tested,  though  not  yet  installed  on  ca£2.  A 
new  program  to  make  periodic  reservations  was  written  by  Joseph  Kaliszewski  and  he  also  modified  the  existing 
reservations  program.  Periodic  reservation  enables  the  convenient  entry  of  lab  hours  and  weekly  scheduled 
preventive  maintenance.  Reservations  indicated  by  the  periodic  reservation  data  structure  are  merged  into 
blank  areas  of  the  reservation  forms  so  as  to  provide  this  information  to  lab  users.  However,  if  a  lab  user  wants 
to  make  a  reservation  which  overrides  these,  he  or  she  can  do  so. 

We  plan  to  re-implement  the  GESTALT  object-like  interface  to  provide  an  in-memory  storage  option 
for  those  sites  and  applications  who  do  not  want  or  need  a  persistent  database,  and  do  want  and  need  fast,  in¬ 
memory,  object-like  data  structuring. 

We  have  begun  serious  consideration  of  the  integration  of  scheduling  programs  with  CAFE.  Xiewei  Bai 
has  started  to  integrate  fabform  into  his  factory  simulation  programs.  We  have  discussed  schema  additions  to 
represent  the  data  required  for  scheduling  programs  but  have  not  yet  settled  on  their  final  form.  The  present 
approach  is  to  write  a  single  machine  scheduler  first  with  lateT  extension  to  a  multi-machine  scheduler. 

The  programs  relating  to  lot  and  wafer  tracking  have  been  rewritten  and  tested  but  not  yet  installed. 
Create  Jot  and  create jjrocessname  were  rewritten  by  Rajeev  Jayavant  Showlot  and  tracklot  were  rewritten  by 
D.  E.  Troxel  The  latter  two  programs  include  optional  reports  of  the  history  of  opinsts  and  existing  wips. 

A  project  to  automate  the  operation  of  the  HP  wafer  prober  has  been  initiated  by  Merit  Hung  and 
Joseph  Kaliszewski  under  the  direction  of  Professor  Antoniadis.  This  project  is  proceeding  in  three  phases,  the 
first  being  development  of  a  stand  alone  program  on  the  computer  supplied  with  the  wafer  prober.  The  second 
phase  will  be  to  enable  the  operation  of  this  machine  from  another  computer  (caf  and/or  ca£2).  A  third  phase 
will  be  to  integrate  the  operation  of  this  machine  with  the  operate-machine  program  running  under  CAFE. 
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MODULAR  PROCESS 

Development  of  the  Profile  Interchange  Format  (PIF)  has  proceeded  in  two  areas:  implementation  of  a 
PDF  program  interface,  and  development  of  individual  PIF  utilities.  Taking  the  published  proposal  for  the 
"intersite”  (or  ASCII)  PIF  as  a  starting  point,  we  have  been  developing  a  set  of  routines  through  which  CAD 
i  and  CUM  programs  may  access  PIF  objects.  This  “intertool”  format  is  based  on  the  GESTALT  database 

interface  to  a  geometric  and  attribute  schema  definition  tailored  to  the  PIF,  and  provides  a  uniform  or 
"standard”  functional  program  interface.  Tools  implemented  with  this  interface  should,  like  the  data  itself,  thus 
be  portable.  We  have  so  far  developed  a  small  set  of  such  utilities,  including  format  conversion  programs  from 
SUPREM-HI  to  the  PIF  (“sup2pif”  and  “pifisup"),  between  the  ASCII  and  database  versions  (“pi£2db”  and 
“pifdump”),  as  well  as  a  simple  profile  plotting  utility  (“pifjplot”).  Implementation  of  the  interface  and  specific 
PIF  utilities  is  continuing. 

The  Technology  (encompassing  both  Process  and  Device)  CAD  Environment  is  based  solidly  on  the 
wafer  representation  (PIF)  and  the  process  representation  (PFL  or  Process  Flow  Language).  During  the  last 
six  months,  we  have  focussed  on  development  of  two  tools  for  this  environment.  The  first  of  these  is  a 
prototype  Simulation  Manager,  which  has  as  a  key  component  a  translator  from  the  PFL  representation  of  the 
process  to  the  input  required  by  SUPREM-IH.  The  specification  and  handling  of  multiple  one-dimensional 
cross  sections  for  simulation  has  been  a  major  addition  to  the  Manager.  Utilities  allow  minimal  simulation  of 
the  process  (the  MIT  CMOS  Baseline  process  has  been  the  vehicle  for  testing  the  manager),  evaluation  of 
where  the  process  diverges  for  multiple  cross  sections,  as  well  as  interactive  examination  of  simulation  results. 
The  second  area  of  research  we  have  begun  is  to  investigate  simple  process  synthesis  utilities.  We  have  written 
initial  versions  of  an  “Oxidation  Advisor,”  an  "Implantation  Advisor,”  and  a  “Diffusion  Advisor”  to  provide 
physically-based  initial  guesses  for  process  parameters  during  process  design.  Work  is  underway  to  further 
build  “Correction  Advisors”,  as  well  as  to  experiment  with  physically-based  optimization  methods. 
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EQUIPMENT  MODELING 

During  this  rime  period,  progress  was  made  on  three  projects,  and  a  fourth  project  was  initiated.  The 
first  ongoing  project  is  an  equipment  model  for  the  LPCVD  of  doped  polysilicon.  The  second  ongoing  project 
concerns  the  use  of  dimensional  analysis  in  the  design  of  experiments.  Hie  third  ongoing  effort  concerns  the 
use  of  sequential  design  of  experiments  for  process  optimization  and  control  The  new  project  concerns  the 
development  of  a  general  framework  for  process  control. 

A  common  element  in  our  equipment  modeling  efforts  is  the  fusion  of  statistical  design  of  experiments 
with  physically  based  mechanistic  modeling.  Our  first  completed  work  is  an  equipment  model  for  LPCVD  of 
polysilicon  in  a  horizontal  tube  furnace.  In  this  work,  George  Prueger  developed  a  finite  difference  model  for 
the  equipment  and  process  and  calibrated  that  model  using  data  derived  from  designed  experiments.  Our 
current  projects  on  modeling  of  doped  poly  and  the  use  of  sequential  design  of  experiments  seek  to  further  the 
effort  at  the  combination  of  experimental  design  and  physical  knowledge. 

The  project  on  modeling  of  the  LPCVD  doped  poly  is  being  carried  out  by  Master’s  student  Parmeet 
Chaddha  with  experimental  support  coming  from  the  BTU  applications  lab,  in  Billerica,  MA.  The  goal  of  this 
work  is  to  develop  an  equipment  model  which  aids  in  the  design  of  the  cage  in  the  doped  poly  process.  The 
function  of  the  cage  is  to  serve  as  a  deposition  site  for  reactants  formed  in  the  annular  layer  between  the  cage 
and  the  tube  walls,  thereby  resulting  in  uniform  thickness  deposition  on  the  wafers.  Hie  challenge  in  cage 
design  is  to  specify  the  size  and  distribution  of  the  holes  so  that  the  deposit  is  uniform  on  the  wafers,  and  the 
deposition  rate  is  as  high  as  possible.  Our  approach  is  to  develop  a  highly  simplified  analytical  model  which 
predicts  deposition  uniformity  and  rate  as  a  function  of  cage  geometry,  and  to  calibrate  this  model  using  data 
derived  from  designed  experiments.  Roughly  one  half  of  the  experiments  have  been  completed  during  this 
reporting  period,  with  the  work  being  performed  on  4  inch  wafers  at  BTU,  Inc. 

Master’s  student  William  Wehrle  has  been  developing  methods  of  using  dimensional  analysis  in  the 
design  of  experiments.  Dimensional  analysis  is  a  relatively  easy  means  of  deriving  relationships  between 
variables  based  on  physical  arguments.  A  very  general  theorem  called  the  Pi  Theorem  provides  a  set  of  rules 
which  can  be  followed  to  create  the  dimensionless  groupings  which  characterize  a  problem.  The  theorem, 
however,  results  in  a  tremendous  choice  of  sets  of  these  groupings.  Mr.  Wehrle  has  developed  two  rules  which 
narrow  the  choice  down  when  the  Pi  theorem  is  applied  to  the  design  of  experiments.  He  will  test  the  theory  in 
application  to  the  LPCVD  of  LTO. 

Master’s  student  Michele  Storm  is  working  on  the  use  of  sequential  design  of  experiments  for  process 
optimization  and  control  in  collaboration  with  Ultramax  Corp.  of  Cincinnati,  OH.  Sequential  optimization  is  a 
technique  particularly  well  suited  to  manufacturing,  where  processes  are  inherently  sequential.  The  basis  of  the 
method  is  to  perform  a  local  regression  to  the  measured  data,  thereby  calibrating  a  quadratic  response  surface 
centered  near  the  current  point  of  operation.  This  response  surface  may  be  considered  to  be  a  Taylor  series 
representation  of  the  process.  The  polynomial  representation  is  then  used  to  design  the  next  experiment  in  a 
continuous  cycle  of  ‘'learning  and  advising”.  Our  goal  is  to  use  dimensional  analysis  to  create  grouped  variables 
for  use  in  the  models,  as  a  replacement  for  the  primitive  variable  currently  used.  Ms.  Storm  has  written  a 
sequential  optimization  program  and  will  shortly  begin  experimental  work  on  wire  bonding.  She  will  compare 
the  effectiveness  of  grouped  (dimensionless)  variables  with  the  effectiveness  of  primitive  variables  for 
optimization  of  wire  bonding. 

Doctoral  student  Ruey-Shan  Guo  was  hired  toward  the  end  of  the  summer  (August  1988).  Mr.  Guo  has 
spent  the  intervening  months  acquiring  a  background  in  statistics  and  experimental  design. 


MECHANICAL-PROPERTY  TCAD 


The  overall  goal  of  this  project  is  the  development  of  a  design  capability  that  includes  the  mechanical 
properties  of  microelectronic  materials.  This  is  a  new  project,  which  began  in  October  1987.  Hie  specific  goal 
during  this  past  year  has  been  to  develop  experimental  methodologies  with  which  to  determine  the  residual 
stress  in  microelectronic  thin  films.  The  material  selected  for  the  first  set  of  experiments  was  polycrystalline 
silicon,  because  it  is  well  known  that  its  residual  stress  is  very  dependent  on  process  conditions.  Several 
experiments  were  carried  out  on  polysilicon  as  a  function  of  its  deposition  and  doping.  Using  techniques  of 
surface  micromachining,  suspended  polysilicon  beams  and  cantilevers  were  fabricated  in  various  thickness, 
doped  variously,  and  subjected  to  different  sequences  of  thermal  annealing.  The  residual  stress  in  the 
polysilicon  and  the  stress  uniformity  were  then  determined  by  examining  vertical  deflection  of  cantilevers  with 
optical  interferometry.  In  addition,  the  maximum  free-standing  non-buckied  beam  length  was  determined, 
which  can  also  be  related  to  residual  stress.  The  detailed  descriptions  and  conclusions  are  contained  in  two 
papers  which  have  resulted  from  this  work,  which  are  attached  to  this  report. 


SCHEDULING 


Research  on  three  activities  continued  during  this  period: 

•  Systems  level  model  of  the  integrated-dr cuit  fabrication  process. 

•  Mathematical  model  of  an  integrated-circuit  fabrication  facility. 

•  Simulation  and  scheduling  software. 

An  important  phenomenon  was  added  to  the  set  of  models  that  have  been  part  of  our  study,  namely  the 
fact  that  many  semiconductor  fabrication  operations  are  performed  on  a  batch  of  wafers  simultaneously. 
Examples  indude  oxidation,  deposition,  ion  implantation,  and  others.  While  this  is  seen  in  other  kinds  of 
manufacturing,  it  is  pervasive  in  this  industry. 

This  feature  is  important  because  in  order  to  realize  the  full  capadty  of  a  system,  the  machine  chambers 
must  be  as  full  as  possible.  There  are  two  reasons  for  this:  (1)  it  takes  as  much  time  to  do  an  operation  on  one 
wafer  as  100,  but  99%  of  the  capadty  is  wasted  if  only  one  wafer  is  in  a  chamber  that  can  hold  100,  and  (2) 
maintenance  must  be  performed  on  machines  that  do  deposition  operations  when  the  total  amount  deposited 
since  the  last  maintenance  reaches  a  given  level,  independent  of  the  number  of  wafers  that  were  in  the  chamber 
when  the  depositions  took  place. 

One  way  to  keep  the  chambers  fuU  is  to  have  large  lots.  However,  this  is  not  desirable  in  a  system  that 
has  low  volume  or  that  has  a  diversity  of  products.  Another  approach  is  to  group  together  distinct  wafers  that 
require  the  same  operations.  This  leads  to  more  complex  modeling  and  scheduling  issues,  which  we  are 
currently  studying  and  simulating.  A  prototype  scheduler  has  been  built  for  a  single  machine. 
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ABSTRACT 

An  equipment  model  has  been  developed  for  the  low  pressure  chemical  vapor  depo¬ 
sition  (LPCVD)  of  polycrystalline  silicon  in  a  horizontal  tube  furnace.  The  model 
predicts  the  wafer-to-wafer  deposition  rate  down  the  length  of  the  tube.  Inputs  to 
the  model  include:  silane  flow  rates  from  three  injectors,  injector  locations,  loca¬ 
tions  of  and  temperatures  at  three  thermocouples,  operating  pressure,  the  number 
of  wafers,  wafer  diameter,  the  location  of  the  wafer  load,  and  other  physical  di¬ 
mensions  of  the  furnace  such  as  tube  length,  and  inner  diameter.  The  model  is 
intended  to  aid  the  process  engineer  in  the  operation  of  equipment,  including  the 
selection  of  optimum  process  parameters  and  process  control  based  on  measured 
deposition  thicknesses.  The  model  is  also  flexible  enough  to  aid  in  the  design  of 
new  equipment. 

The  one  dimensional  finite  difference  model  encompasses  the  convective  and  dif¬ 
fusive  fluxes  of  silane  and  hydrogen  in  the  annular  space  between  the  wafer  load  and 
tube  walls.  The  reaction  of  silane  is  modeled,  with  full  account  taken  of  the  genera¬ 
tion  and  transport  of  hydrogen.  Kinetic  and  injection  parameters  in  the  model  were 
calibrated  using  a  series  of  nine  statistically  designed  experiments  which  varied  four 
parameters  over  three  levels.  The  model  accurately  predicts  the  axial  deposition 
profile  over  the  full  range  of  experimentation  and  demonstrates  good  extrapolation 
beyond  the  range  of  experimental  calibration.  The  model  was  used  to  predict  a 
set  of  process  parameters  that  would  result  in  the  least  variation  of  deposition  rate 
down  the  tube.  The  predicted  parameters  agree  well  with  experimentally  deter¬ 
mined  optimum  conditions. 
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ABSTRACT 


This  thesis  advances  boundary  element  techniques  to  model  thermal  oxidation  of 
silicon  in  two  dimensions.  At  temperatures  encountered  in  thermal  oxidation,  sili¬ 
con  dioxide  flows  viscoelastically.  A  reduced-dimension,  generalized  boundary  element 
method  for  modeling  such  a  problem  has  been  developed.  With  a  Laplace  transform 
technique,  a  viscoelastic  kernel  function  is  derived  from  Kelvin’s  solution,  which  is  the 
fundamental  solution  to  linear  elasticity.  Constant-velocity  loading  is  chosen  to  oper¬ 
ate  with  a  wide  range  of  stress  relaxation  times.  This  scheme  is  capable  of  replacing 
boundary  element  methods  developed  for  slow  viscous  flow  and  elastic  deformation. 
The  oxidant  diffusion  problem  is  solved  using  a  standard  potential  method  for  Laplace 
problems.  Generated  by  oxide  growth,  stress  affects  both  oxidant  diffusion  and  oxide 
flow.  In  particular,  it  changes  the  diffusivity  of  oxidants  and  viscosity  of  oxide,  render¬ 
ing  the  diffusion  and  flow  problems  nonhomogeneous.  Domain  solutions  axe  needed  for 
both  processes.  A  unified  initial  stress/built-in  field  formulation  has  been  developed 
to  account  for  the  nonlinear  effects,  using  interior  cells  that  are  placed  where  stress 
is  significant.  The  interior  solutions  are  realized  with  an  interfacial  source  method, 
whereby  an  area  integral  for  a  cell  is  transformed  to  a  line  intgral  on  the  perimeter 
of  the  cell.  It  has  been  found  that  kernel  functions  based  on  Kelvin’s  solution  axe  de¬ 
ficient  in  modeling  incompresible  materials  with  a  “hole”.  A  correction  method  that 
uses  a  source  term  placed  at  the  center  of  the  hole  has  been  implemented  to  overcome 
the  numerical  problem. 
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AN  INTELLIGENT  PROCESS  FLOW  LANGUAGE  EDITOR 


Rajeev  Jayavant 


Abstract 

A  process  flow  language  allows  a  user  to  specify  the  process  used  to  fabricate  integrated  circuits  on  a 
silicon  wafer.  By  using  a  flow  language,  a  designer  can  modify  his  process  more  easily  and  processing 
equipment  can  be  reconfigured  for  use  with  different  processes. 

While  the  benefits  gained  from  using  a  process  flow  language  have  been  discussed  frequently,  one  major 
drawback  of  using  a  flow  language  has  been  overlooked:  users  must  code  their  process  flows  in  the  flow 
language.  This  may  not  seem  like  a  disadvantage,  but  it  is  rather  difficult  to  convince  people  to  so 
something  they  have  never  done  before.  Furthermore,  the  process  must  be  in  a  language  that  faintly 
resembles  Lisp,  not  a  very  appealing  thought  for  users  whose  primary  interest  is  in  processing  wafers,  not 
programming  computers.  Thus  there  is  a  severe  need  for  some  tool  to  facilitate  the  coding  of  processes 
flows  in  the  process  flow  language. 

Various  types  of  programming  aids  have  been  used  in  the  past  to  facilitate  software  development:  syntax 
checkers,  semantic  checkers,  preprocessors,  and  intelligent  editors.  The  process  flow  editor  combines 
attributes  from  all  of  these.  The  primary  difference  between  the  process  flow  editor  and  conventional 
editors  is  that  the  flow  editor  presents  the  flow  to  the  user  in  a  format  that  is  very  different  from  the  format 
seen  by  applications  accessing  the  flow. 

By  using  a  different  format  in  presenting  the  flow  to  the  user,  most  people  will  not  have  to  leam  the  Lisp-like 
syntax  of  the  flow  language  and  can  concentrate  on  what  they  really  want  to  do  -  specify  a  process  flow. 
The  current  implementation  of  the  flow  editor  uses  a  forms-based  interlace  to  present  the  flow  as  a 
collection  of  nested  operations.  A  forms-based  interface  is  appealing  because  it  facilitates  the  design  of 
the  editor  while  providing  an  interface  that  lab  users  will  recognize  from  several  CAFE  applications.  The 
use  of  forms  also  allows  the  flow  editor  to  highlight  the  decomposition  of  the  process  flow  into 
parameterized  operations,  thereby  providing  a  more  informative  view  of  the  flow  to  the  user. 

Syntax  and  semantic  checking  is  performed  as  the  user  enters  the  flow.  The  time  required  to  code  a 
process  flow  is  reduced  since  many  common  errors  are  caught  as  they  are  made  rather  than  being 
discovered  at  a  later  time  by  an  interpreter. 
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Abstract 

Doubly  supported  polysilicon  microbeams  are  fabricated  by  removing  a  sacrificial  oxide  layer  from  under 
an  LPCVD  polysilicon  film  using  hydrofluoric  acid  (HF)  as  an  etchant.  Compressive  stress  in  the  beams 
causes  buckling  at  a  critical  length,  which  depends  on  the  polysilicon  and  oxide  process  history.  This  paper 
reports  a  matrix  investigation  of  the  maximum  free-standing  length,  Lmt  of  polysilicon  microbeama  as  a 
function  of  beam  width  and  thickness,  variations  in  doping  and  annealing  conditions,  and  different  sacrificial 
oxides.  The  results  indicate  that  the  Lm  of  polysilicon  microbeams  has  no  significant  dependence  on  beam 
width,  and  increases  with  beam  thickness.  Undoped  beams  which  are  annealed  are  more  rigid  than  undoped, 
unannealed  beams.  Lm  of  boron  doped  beams  is  smaller  than  for  phosphorus  doped  beams,  and  while  an 
annealing  cycle  increases  Lm  for  boron  doped  beams,  it  shows  little  effect  for  phosphorus  doped  beams.  A 
larger  Lm  is  obtained  when  PSG  is  used  as  the  sacrificial  oxide  layer  instead  of  an  undoped  oxide. 

Introduction 

Thin  films  of  LPCVD  polysilicon  have  been  widely  used  in  integrated  circuits.  Recently,  with  the  devel¬ 
opment  of  surface-micromachining  techniques,  polysilicon  is  also  becoming  an  important  mechanical  material 
for  microsensor  and  microactuator  applications.  Along  with  electrical  characteristics,  attention  is  now  given 
to  the  the  effect  of  the  mechanical  stress  of  polysilicon  films.  Many  investigators  have  developed  methods 
for  characterising  the  mechanical  properties  of  films.  Guckel  demonstrated  the  direct  measurement  of  the 
film's  strain  level  with  experimental  determination  of  the  onset  of  buckling  for  known  geometries  [1].  Choi 
reported  the  stress  effects  in  boron-implanted  polysilicon  films  using  comparative  warpage  measurements  [2]. 
Howe  used  relaxed  silicon  overhangs  to  determine  the  compressive  stress  in  polysilicon  and  a-Si  thin  films 
by  measuring  observed  edge  deflections,  which  vary  sinusoidally  [3], 

Doubly  supported  polysilicon  microbcams  have  found  increasing  microsensor  and  microactuator  appli¬ 
cations  [4,  5].  Since  compressive  stress  can  cause  such  beams  to  buckle  when  they  reach  a  critical  length, 
the  design  of  these  microstructures  is  constrained  by  the  maximum  free-standing  length,  Lm,  for  which  the 
microbeam  is  stable  and  maintains  little  deflection.  This  paper  reports  the  fabrication  of  doubly  eupported 
polysilicon  microbeams  of  various  sisea  and  with  varying  doping  and  annealing  conditions  using  conventional 
IC  process  techniques,  and  investigates  the  dependence  of  the  maximum  free-standing  length  on  microbeam 
thickness  and  process  conditions. 

Experimental 

Using  surface-micromachining  techniques,  polysilicon  microbeams  are  fabricated  on  two  groups  of  4- 
inch,  <100>  oriented,  lightly  doped  wafers.  On  the  first  wafer  set,  a  1/im-thick  sacrificial  thermal  oxide  is 
grown  at  950°  C.  The  second  group  of  wafers  receives  a  2  hour,  925aC  diffusion  cycle  with  a  POCls  liquid 

'A  visiting  scientist  from  Shanghai  Component  No. 5  Factory  of  China 
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diffusion  source  Wore  the  ihermel  oxide  is  grown.  This  produces  a  phosphcius  content  of  1.3  wt%  in  the 
thermal  oxide.  The  oxide  layer  on  both  groups  of  wafer*  is  patterned  and  dry  etched  using  CF*  and  CHFj 
to  define  trenches  for  creating  pedestals  for  the  beams.  Pure  silane  is  reacted  at  250  mT  and  025°C  to 
deposit  blanket  LPCVD  polysilicon  films  ranging  in  thickness  from  0.25/101  to  1/im.  Some  samples  receive  a 
60  minute,  925*C  doping  cycle  with  either  a  POCI3  or  BBra  source.  The  PSG  or  BSG  is  removed  from  the 
polysilicon  surface  using  7:1  buffered  HF.  Then  two  sample  groups  containing  both  doped  and  undoped  films 
are  annealed  in  an  N?  ambient  either  at  950® C  for  30  min  (tow  T  anneal),  or  at  1100°C  for  20  min  (high 
T  anneal).  After  anisotropically  dry  etching  the  polysilicon  films  with  CCl*  to  pattern  the  microbeams  and 
remove  the  polysilicon  from  the  backside  of  the  wafers,  the  sacrificial  oxide  film  is  undercut  using  1:1  HF:HjO 
to  release  the  suspended  structures.  Once  the  polysilion  bridges  are  freed  they  are  delicate,  requiring  careful 
handling  during  rinsing  and  drying.  The  samples  are  gently  rinsed  first  with  DI  water  and  then  methanol, 
since  it  has  a  lower  surface  tension  coefficient  than  water.  Up  to  the  removal  of  the  sacrificial  oxide,  the 
fabrication  process  is  compatible  with  standard  IC  process  equipment  and  techniques.  Each  wafer  contains 
60  dies,  and  each  die  contains  81  microbeams  of  5/im,  10/im  and  20/tm  widths.  The  beams  range  in  length 
from  10/un  to  50/un  in  5/un  increments,  from  60 ft  to  200/im  in  10/im  increments,  and  from  200/un  to  300/un 
in  20/un  increments.  The  samples  are  inspected  for  the  maximum  free-standing  length,  Lm,  using  a  Leits 
Model  SMLUX  inspection  microscope  and  Hitach  Model  S-806  low  voltage  scanning  electron  microscope. 
Lm  is  defined  as  the  longest  observed  beam  length  before  the  length  which  buckles.  The  vertical  deflection 
magnitude  and  profile  of  the  unbuckled  microbeams  is  determined  by  interferometric  deflection  measurement 
techniques  (6). 

Results  and  Discussion 

All  of  the  60  duplicated  patterns  on  each  4  inch  wafer  are  inspected.  At  lengths  less  than  the  buckling 
length,  Let  the  phosphorus  doped  beams,  whether  unannealed  or  annealed,  are  undeflected.  While  undoped, 
unannealed  beams  less  than  La  deflect  slightly  upward,  boron  doped,  unannealed  beams  exhibit  large,  upward 
deflections  at  lengths  leas  than  £,.  This  may  be  caused  by  a  doping  dependent  stress  nonuniformity  through 
the  beam  thickness.  £„  is  seen  to  be  the  same  for  S/un,  10/un  and  20/un  wide  beams  across  a  wafer, 
and  from  wafer  to  wafer,  suggesting  that  there  is  little  dependence  of  £n  on  beam  width  for  this  range  of 
widths.  Table  1  summarises  the  dependence  of  Lm  on  beam  thickness  for  undoped  and  phosphorus  doped 
beams.  Figure  1  and  2  illustrate  trends  for  the  dependence  of  the  average  beam  buckling  length,  Le,  on 
beam  thickness.  Lm  and  La  are  larger  for  undoped,  annealed  beams  than  for  undoped,  unannealed  beams, 
with  the  higher  annealing  temperature  producing  the  longest  unbuckled,  undoped  beams.  Lm  and  L„  are 
constant  for  phosphorus  doped  beams,  whether  unannealed  or  annealed,  suggesting  that  the  beams  are  in  a 
stable  equilibrium  state  after  the  doping  cycle. 


Table  1.  Lm  in  Various  Beam  Thickness  t  and  Process  Conditions 


process 

conditions 

L _ . _ _  _  F 

t  si  0.25  ftm 

t  =  0.52  ftm 

t  =0.8  /tm 

t  =  1.07  ftm 

undoped,  unanneal 

20-25  ftm 

30-40  ft m 

45-50  ftm 

50  -70  ftm 

undoped,  low  T  anneal 

20-25  ftm 

35-45  ftm 

50-70  ftm 

70-80  ftm 

undoped,  high  T  anneal 

25-30  /tm 

45-50  ftm 

70-80  ft m 

80-90  ftm 

process 

L _ _  S; 

t  =*  0.2  /im 

t  =  0.46  ftm 

t  =  0.75  ftm 

t,=  1  ftm 

15-20  ft  m 

35-40  ftm 

50-70  ftm 

70-100  ftm 

I  P  doped,  low  T  anneal 

15-20  ftm 

35-45  ftm 

50-70  ftm 

70-100  ftm 

]  P  doped,  high  T  anneal 

15-20  /im 

35-40  ftm 

50-70  ftm 

70-100  ftm 

The  increase  in  buckling  length  for  thin  beams  beyond  that  predicted  by  the  thicker  beams  may  be 
explained  by  a  simple  model  that  accouOBnts  for  compliance  of  the  step-up  beam  supports.  This  model  will 
be  reported  separately.  If  compliance  of  the  beam  support  is  ignored,  a  good  approximation  for  the  residual 
strain  level,  ce,  of  the  1/im  thick  beams  at  the  critical  buckling  length  is  estimated  by  Euler  buckling  theory 
[7]  for  each  process  condition  as 


Umjkr  (J) 

where  t  is  the  polysilicon  film  thickness.  Based  on  Eq.  (2),  Fig.  1,  and  Fig.  2,  c«  and  L,  for  each 
of  the  polysilicon  process  conditions  are  shown  in  Table  2.  These  values  are  in  good  agreement  with  those 
previously  reported  by  Guckel  [1]. 

Table  2.  Buckling  Length  Lc  and  Strain  «e 


KXIMITWOCHCSJt  (MS 


Figure.  1.  Buckling  length  as  a  function  of  film 
thickness  for  polysilicon  doubly-supported 
undoped  beams  with  or  without  annealing 


Figure  2.  Buckling  length  as  a  function  of  Aim 
thickness  for  polysilicon  doubly-supported 
phosphorus  doped  beams  with  or  without 
annealing 


Table  3  compares  Lm  for  boron  and  phosphorus  doped  microbeams,  0.48  or  l/im-thick,  unannealed  and 
annealed.  Although  both  doping  cycles  are  completed  at  the  same  temperature  and  for  the  same  time,  the 
unannealed,  boron  doped  samples  exhibit  a  shorter  Lm  than  the  unannealed,  phosphorus  doped  samples. 
After  annealing,  both  boron  and  phosphorus  doped  samples  display  similiar  free-standing  lengths,  which  are 
the  same  for  phosphorus  doped,  unannealed  beams.  This  suggests  that  phosphorus  doping  has  a  stronger 
effect  on  grain  growth  than  boron  doping.  Mei  mentions  variable  grain  growth  enhancement  with  polysilicon 
phosphorus  doping,  and  little  growth  with  boron  doping  (8{.  Mandurah  reports  that  phosphorus  has  a 
strong  tendency  to  segregate  at  grain  boundaries  while  boron  does  not  [9|.  The  phosphorus  impurities 
segregated  at  the  boundaries  during  the  doping  cycle  may  hinder  boundary  migration,  thereby  inhibiting 
further  polysilicon  grain  growth  during  the  following  anneal  cycle. 
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A l«o  compared  in  Table  3  are  the  free-standing  longthe  for  microbeaxna  fabricated  on  undoped  or  phos- 
phorue  doped  oxide.  Lm  is  dramatically  longer  for  beams  fabricated  on  the  phosphorus  doped  oxide  than  for 
beams  on  undoped  oxide.  This  length  increase  implies  that  the  phosphorus  in  the  underlying  oxide  effects 
both  the  grain  sise  and  doping  profile  in  polysilicon,  therby  reducing  the  level  of  stress  in  the  film.  The 
high  temperature  annealing  cycle  is  more  effective  at  increasing  the  free-standing  length  of  thicker  beams 
than  thinner  beams,  indicating  that  more  redistribution  of  dopant  occurs  in  thicker  beams  during  the  an¬ 
nealing  cycle.  Alternatively,  the  hinge  compliance  of  the  thinner  beams'  supports  may  mask  the  effects  of 
the  annealing  cycle  on  the  free-standing  length. 


Table  3.  Lm  in  Different  Doping  conditions  and  Sacrificial  Oxide  Layers 


Bill 

anneal 

_ Lm _ [I 

t  *»  1  pm 

|  undoped 

B  doped 

unanneal 

25-30  pm 

50-60  pm 

1  undoped 

35-40  pm 

70-120  pm 

IIEE&H2BI 

P  doped 

un  anneal 

35-40  pm 

70-100  pm 

undoped 

laimi 

35-40  pm 

70-100  pm 

P  doped 

P  doped 

unanneal 

60-70  pm 

100-160  pm 

P  doped 

P  doped 

high  T  anneal 

50-60  pm 

120-180  pm 

Conclusion 

Doubly  supported  polysilicon  microbeams  are  fabricated  with  conventional  IC  process  tecniques.  The 
maximum  free-standing  length  of  the  beams  is  seen  to  not  be  a  function  of  width,  but  is  dependent  on  beam 
thickness.  Before  buckling,  undoped  and  boron  doped  beams  which  are  not  annealed  deflect  slightly  upward, 
while  phosphorus  doped  beams,  and  undoped  and  boron  doped  beams  which  are  annealed  are  undeflected. 
Accordingly,  the  annealing  cycle  significantly  increases  Lm  for  undoped  and  boron  doped  beams,  but  does  not 
significantly  alter  Lm  for  phosphorus  doped  beams.  For  the  phophorus  doped  beams,  Lm  <  xt/  Vl-2  X  10~3  ; 
undoped  and  unannealed  beams,  Lm  <  irt/  ^3.2  X  10_i  ;  undoped  beams  with  a  low  temperature  anneal, 
Lm  <  at/  y/2.0  x  10~a  ;  and  undoped  beams  with  a  high  temperature  anneal,  Lm  <  jrt/  \/1.4  x  10~3  .  Use 
of  PSG  as  the  sacrifical  oxide  layer  significantly  increases  Lm,  suggesting  that  more  rigid  microbeams  may 
be  fabricated  using  this  as  the  micromachining  spacer  layer. 
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Abstract 

This  paper  discusses  the  synthesis  of  operating  policies  for  manufacturing  systems.  These 
are  feedback  laws  that  respond  to  potentially  disruptive  events.  We  develop  laws  that  are 
based  on  realistic  dynamic  programming  m<xiels  which  account  for  the  discrete  nature  of 
manufacturing  and  which  are  computationally  tractable. 

These  scheduling  and  planning  policies  have  a  hierarchical  structure  which  is  systematically 
based  on  the  production  process.  The  levels  of  the  hierarchy  correspond  to  classes  of 
events  that  occur  with  distinct  frequencies.  At  each  level,  feedback  laws  select  (1)  times  for 
the  controllable  events  whose  frequency  class  is  treated  at  that  level,  and  (2)  frequency 
targets  for  much  higher  frequency  controllable  events. 

In  this  hierarchy, 

(1)  Most  calculations  deal  with  expected  rates  of  high  frequency  activities,  conditioned  on 
current  states  of  low  frequency  activities.  There  is  an  important  relationship  between 
conditional  expected  rates  at  different  levels. 

(2)  Rates  are  constrained  because  only  one  activity  (e.g.,  production  operation)  can  take 
place  at  one  resource  (e.g.,  machine)  at  any  time. 

(3)  Rates  are  found  at  each  level  according  to  a  dynamic  programming  problem  for  which 
there  exists  good  approximate  solutions.  Times  for  controllable  events  are  chosen  to 
agree  with  those  rates. 
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Abstract 

Many  new  database  applications  require  computational  and  data  modelling  power  simply  not 
present  in  conventional  database  management  systems.  Developers  are  forced  to  design  complex 
encodings  of  complex  data  into  a  limited  set  of  database  types,  and  to  embed  DML  commands 
into  a  host  programming  language,  a  notoriously  tricky  and  error-prone  enterprise. 

In  this  paper,  we  describe  the  design  and  implementation  of  GESTALT,  a  system  and 
methodology  for  organizing  and  interfacing  to  multiple  heterogeneous,  existing  database  sys¬ 
tems.  Application  programs  are  written  in  a  supported  programming  language  (currently  C 
and  Lisp)  using  high-level  data  and  control  abstractions  native  to  the  language.  The  system 
is  flexible  in  that  the  underlying  database  systems  can  easily  be  replaced/upgraded/augmented 
without  affecting  existing  application  programs. 

We  also  describe  our  experience  with  the  system:  GESTALT  has  been  in  daily  operational 
use  at  MIT  for  over  a  year,  supporting  an  information  system  for  CAF,  a  research  facility  for 
the  automation  of  semiconductor  fabrication. 
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Introduction 


Database  management  systems  have  traditionally  been  used  in  administrative  applications  to  pro¬ 
vide  efficient  access  to  large  data  sets,  preserve  data  integrity  and  consistency,  and  to  control 
access.  Recently,  however,  database  systems  have  been  applied  to  diverse  new  domains  such  as 
VLSI  computer-aided  design  and  voice  and  image  processing. 

A  straightforward  application  of  conventional  database  technology  ( e.g current  relational  sys¬ 
tems)  to  these  nontraditional  areas  has  met  with  limited  success.  A  key  problem  is  the  difficulty 
inherent  in  expressing  complex  objects  and  operations  in  terms  of  relations  and  relational  operators. 
Moreover,  conventional  systems  provide  no  general  mechanism  for  abstracting  over  a  given  set  of 
operations  or  data,  thus  all  representation  and  manipulation  must  be  encoded  directly  in  terms  of 
primitive  constructs.  In  many  instances,  such  a  relational  representation  is  cumbersome,  leading 
to  abstruse  application  logic. 

’This  research  was  sponsored  in  part  by  the  Defense  Advanced  Research  Projects  Agency  (DoD)  through  the 
Office  of  NavaJ  Research  under  Contract  Number  N00014-85-K-0213. 

fRoom  36-667,  MIT,  77  Massachusetts  Avenue,  Cambridge  MA  02139,  USA;  Phone:  (617)-253-781 1; 
Arpanet:  heytenBAca2.ait.edu 

lMIT  Laboratory  for  Computer  Science,  545  Technology  Square,  Cambridge,  MA  02139,  USA; 
Phone:  (617)-253-0237;  Arpanet:  nikhilAxr.lcs.ait.edu 
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To  obtain  adequate  expressivity  and  abstraction,  application  developers  typically  embed  query 
language  commands  into  a  host  programming  language.  An  embedded  interface,  however,  is  noto¬ 
riously  tricky  and  error-prone.  Programmers  must  contend  with  both  database  and  programming 
language  concepts,  e.g.,  two  mismatched  type  systems,  two  error /exception  mechanisms,  two  types 
of  control  structures,  etc.  These  inherent  difficulties  make  it  accessible  only  to  database  experts.  A 

A  number  of  research  efforts  are  currently  underway  aimed  at  developing  more  expressive 
database  systems  [1,4,7,11,16].  Still,  conventional  systems  are  likely  to  remain  the  commercially- 
available  state  of  the  art  for  a  number  of  years.  Thus,  a  natural  question  is:  How  can  we  more 
effectively  organize  and  interface  to  today's  database  systems. 

This  paper  presents  one  possible  organization,  developed  as  part  of  the  integrated  circuit  com-  “  ® 

puter  integrated  manufacturing  (IC-CIM)  effort  at  the  Massachusetts  Institute  of  Technology.  Our 
system,  called  GESTALT,  is  currently  employed  by  CAFE  (Computer-Aided  Fabrication  Environ¬ 
ment)  a  sophisticated  information  system  supporting  the  many  facets  of  IC  manufacturing.  From  a 
database  perspective,  CAFE  is  an  especially  challenging  domain  due  to  the  diversity  of  information  — 

it  manages,  and  the  many  complex  programs  it  supports  for  the  acquisition  and  manipulation  of  ® 

semiconductor  manufacturing  data.  (For  a  discussion  of  IC-CIM  data,  see  [8].) 

One  of  the  main  motivations  behind  GESTALT  was  the  need  for  an  application  development 
paradigm  that  did  not  require  database  and/or  programming  language  expertise.  The  implied 
requirement  for  simplicity,  however,  had  to  be  tempered  with  enough  expressive  power  to  permit  a 
conceptually  natural  representation  of  the  complex  structures  found  in  the  domain  of  semiconductor  - 

manufacturing. 

Another  aim  of  GESTALT  was  database  independence ,  i.e.,  the  model  visible  to  applications 
was  to  be  independent  of  the  actual  underlying  database  management  system  or  systems.  This 
capability  provides  a  great  deal  of  implementation  flexibility  as  well  as  an  elegant  way  to  support 
“cross  system”  queries.  The  latter  (similar  to  [3,15])  is  especially  useful  in  a  manufacturing  setting,  * 

e.g.,  it  can  provide  applications  with  a  unified  view  of  design,  manufacturing,  test,  and  product 
sales  data,  even  though  this  information  may  physically  reside  at  multiple  heterogeneous  database 
systems.  Without  this  kind  of  support,  application  programmers  must  query  component  databases 
directly  (after  first  mastering  all  the  necessary  interfaces!)  combining  partial  results  to  form  the 
desired  answer.  This  process  is  complicated,  tedious,  and  error-prone.  J 

Finally,  in  a  design  and  manufacturing  environment,  an  effective  data  management  system  must 
seamlessly  encompass  a  variety  of  CAD/CAM  tools;  it  is  simply  not  feasible  to  rewrite  data  and 
operations  supported  by  these  tools  into  a  single  monolithic  DBMS.  Thus,  the  GESTALT  data 
model  had  to  support  the  databases  and  procedural  primitives  of  such  tools. 

We  begin  by  presenting  an  overview  of  GESTALT  in  the  next  section.  Section  3  describes  the  i 

application  interface,  illustrating  the  programming  model  with  sample  applications.  In  Section  4 
we  describe  the  internal  organization  of  GESTALT,  and  finally,  we  conclude  with  a  discussion  of 
the  experience  gained  from  using  GESTALT  in  CAFE —  a  large,  complicated  system. 

< 

2  Overview  of  GESTALT 

In  essence,  GESTALT  is  a  layer  of  abstraction  which  shields  application  programs  from  underly¬ 
ing  database  systems.  The  application  programming  paradigm  consists  of  a  supported  language 
(currently  C  and  Common  Lisp)  and  a  set  of  abstractions  for  accessing  and  manipulating  persis-  __  1 

tent  data.  An  important  point  is  that  these  abstractions  are  constructed  using  mechanisms  native 
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to  the  host  language,  thus  application  developers  do  not  have  to  contend  with  the  programming 
language-database  dichotomy  typically  found  in  conventional  approaches. 

The  two  supported  languages  have  their  strengths  and  weaknesses.  Many  new  database  appli¬ 
cations  have  an  artificial  intelligence  component,  for  which  Lisp  is  the  preferred  language.  Also, 
the  interactive  nature  of  Lisp  is  convenient  for  posing  ad  hoc  interpreted  queries  (analogous  to 
query  interpreters  in  conventional  systems).  The  C  interface  is  available  for  lower-level  tasks  or  for 
applications  requiring  the  compactness  and  speed  of  compiled  C  code. 

The  basic  architecture  of  GESTALT  consists  of  a  software  system  running  atop  existing  het¬ 
erogeneous  databases.  GESTALT  is  not  itself  a  full-fledged  DBMS,  rather  it  is  a  mechanism  for 
logically  integrating  disparate  data  managers.  This  integration  enables  applications  to  view  data  in 
terms  of  a  unified  “global”  schema.  While  GESTALT  supports  both  read-only  and  update  queries, 
it  does  not  allow  dynamic  schema  modifications.  Such  updates  require  changes  to  the  system  data 
dictionary,  which  must  be  “hand-coded”1  by  the  database  administrator  (DBA). 

Figure  1  illustrates  both  a  conventional  application  architecture  (a)  and  the  organization  of 
applications  written  in  GESTALT  (b).  In  conventional  systems,  applications  are  written  using  an 
embedded  query  language  interface  (e.g.,  QUEL  in  C);  the  resulting  program  is  then  transliterated 
by  a  pre-processor  and  finally  compiled.  Conceptual  entities  and  operations  must  be  encoded 
and  manipulated  explicitly  in  terms  of  primitives  supported  by  the  data  model,  leading  to  much 
complexity  in  coding  applications. 


Application 

Programs 

1/  \ 


(a)  (b) 

Figure  1:  Structure  of  (a)  conventional  systems  and  (b)  GESTALT. 

In  GESTALT,  applications  are  also  coded  in  a  host  programming  language  (C  or  Common 
Lisp.)  Application  programmers  determine  how  to  query  and  update  persistent  data  by  examining 
a  specification  detailing  the  interface  to  a  collection  of  procedural,  data,  and  iteration  abstractions. 
(For  a  good  discussion  of  the  techniques  of  abstraction  and  specification,  see  [10].)  The  abstractions 

lA  structured  schema-interface  tool  has  been  implemented  which  makes  updating  the  data  dictionary  also 
straightforward. 
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which  describe  persistent  data  are  object-oriented  in  nature2,  enabling  applications  to  view  domain 
data  at  a  level  commensurate  with  the  actual  real-world  structure. 

GESTALT  provides  a  rich  set  of  primitive  data  types  and  explicit  support  for  null  objects. 
Without  system  support  of  this  kind,  programmers  often  resort  to  nonstandard  representations  of 
null  values  and  awkward  encodings  of,  say,  temporal  and  engineering  data.  In  a  large  project  where 
communication  between  developers  is  difficult,  this  can  lead  to  inconsistencies  and  incompatibilities 
between  applications. 

Many  of  the  procedural  abstractions  supported  by  the  system  are  higher-order  (e.g.,  maps  and 
filters).  Operations  of  this  sort  encourage  a  clear  and  elegant  style  of  programming,  allowing  concise 
expression  of  a  wider  range  of  computational  processes  [12,18]. 

3  The  Application  Interface 

The  interface  visible  to  application  programs  is  similar,  in  many  respects,  to  the  functional  data 
model  [12,14].  A  GESTALT  schema  consists  of  a  specification  of  entity  or  object  types,  and 
a  collection  of  operations  defined  on  instances  of  those  types.  This  information  is  available  to 
application  programmers  in  two  different  forms.  The  first  is  a  textual  specification  (maintained 
by  the  system)  which  provides  an  overview  of  each  object  type  and  a  description  of  the  effects, 
modifications,  and  requirements  of  the  associated  operators.  The  second  is  the  data  dictionary;  all 
query  facilities  supported  by  GESTALT  are  available  for  accessing  schema  information  stored  in 
the  system  catalogs.  The  interpreted  Lisp  interface  provides  a  convenient  mechanism  for  browsing 
through  meta-data  of  this  form. 

At  the  end  of  this  section  we  present  an  example  application  as  a  concrete  illustration  of  the 
programming  model.  We  first,  however,  examine  various  aspects  of  the  interface  in  more  detail. 

3.1  Object  Types 

All  data  in  a  GESTALT  system  are  viewed  as  abstract  types,  i.e.,  each  abstract  type  specifies 
an  object  type,  a  list  of  named  typed  slots  (which  record  attributes  and  relationships)  and  the 
set  of  procedures  available  for  manipulating  objects  of  that  type.  This  is  a  very  powerful  tool, 
enabling  the  DBA  to  add  new  kinds  of  data  objects  cleanly,  effectively  extending  the  application 
programmer’s  virtual  machine. 

Operations  in  data  abstractions  can  be  divided  into  the  following  categories: 

selectors  —  procedures  to  select  an  object  component 
mutators  —  procedures  to  update  or  delete  an  object 
constructors  —  procedures  to  create  objects 
generators  —  procedures  to  generate  null  objects  and  iterators 

A  GESTALT  system  contains  both  pre-defined  and  domain-specific  abstract  types.  The  latter 
are  abstractions  of  objects  present  in  the  application  domain,  while  the  former  encompass  all  the 
built-in  types,  including  an  extended  set  of  primitive  entity  types,  abstract  types  for  the  data 
dictionary,  and  list-  and  tuple-structured  types. 

2In  the  following  sense:  Real-world  entities  are  modelled  as  “objects”  of  various  abstract  types,  objects  can 
“contain”  other  objects,  and  objects  can  share  other  contained  objects.  We  do  not  model  inheritance. 
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It  is  important  to  realize  that  even  though  a  particular  type  may  be  built-in,  its  behavior  is  still 
specified  only  in  terms  of  the  operators  defined  on  instances  of  that  type,  so  that  the  user  sees  no 
semantic  difference  between  built-in  and  user-defined  types.  Hiding  the  representation  of  built-in 
types  enables  the  system  to  perform  run-time  type  checking.  It  also  eliminates  the  possibility  that 
applications  depend  on  particular  encodings  of  values. 


3.1.1  Pre-defined  Types 

The  set  of  primitive  types  supported  was  designed  specifically  to  facilitate  modelling  the  temporal 
and  engineering  data  commonly  found  in  manufacturing  and  design  domains.  The  atomic  members 
of  this  set  are: 

TEXT  -  variable  length  array  of  bytes 
INTEGER  -  fixed  point  number 
FLOAT  •  floating  point  number 
BOOLEAH  -  “true”  or  “false” 

DATE  -  month,  day,  and  year 

TIMEOFDAY  -  hours,  minutes,  and  seconds 

TIMEDURATIOM  -  elapsed  time  in  hours,  minutes,  and  seconds. 

Also  pre-defined  are  several  inexact  and  interval  types.  The  inexact  types  record  a  value  and 
an  uncertainty;  interval  types  consist  of  an  upper  and  lower  bound.  These  provide  a  convenient 
means  of  encoding  engineering  and  scientific  data,  e.g.,  the  thickness  of  a  layer  of  epitaxial  silicon 
(say  1250  ±  30  A)  can  be  naturally  expressed  as  an  INTEGER_INEXACT  with  the  appropriate  value 
and  uncertainty. 

To  provide  applications  with  a  convenient  mechanism  for  grouping  related  objects,  GESTALT 
provides  the  structured  types  LIST  and  NTUPLE  (n-tuples).  The  former  is  a  homogeneous  sequence 
of  objects  of  unspecified  length,  while  the  latter  is  a  heterogeneous  collection  of  a  fixed  number  of 
objects. 

GESTALT  provides  operators  to  map  between  pre-defined  types  and  host  programming  lan¬ 
guage  values.  These  routines  are  needed  because  the  representations  of  even  atomic  GESTALT 
types  are  hidden  from  application  programs.  For  example,  the  following  Lisp  procedure  prints 
the  id,  crystal  orientation,  and  sheet  resistivity  of  a  specified  wafer  by  applying  dbl-coerce —  an 
operator  which  coerces  atomic  built-in  data  objects  to  corresponding  Lisp  values —  to  the  desired 
wafer  attributes. 

(defun  wafer-print  (w  output -stream) 

(let  ((wd  (wafar-waf erdesc  w))) 

(format  output-stream  "wafer  'A:  <*A>,  "A  ohm/ square 
(dbl-coerce  (wafer-id  w)) 

(dbl-coerce  (waf erdesc-orientation  wd)) 

(dbl-coerce  (waferdesc-sheet-resistivity  wd))))) 

The  data-dictionary  is  represented  as  a  pre-defined  set  of  abstract  types,  complete  with  selectors 
and  generators,  so  that  the  same  computational  model  can  be  used  to  peruse  it.  These  include: 

TYPE  -  built-in  and  domain-specific  types 
ATTRIBUTE  -  object  components  or  slots 
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OPERATOR  -  structure  for  describing  operators 
OPERATORARG  -  formal  operator  arguments 


3.1.2  Domain-Specific  Types 

It  is  the  responsibility  of  the  DBA  to  augment  the  data  dictionary  accordingly  with  the  domain- 
specific  abstract  types  for  a  given  domain.  We  have  implemented  many  tools  to  facilitate  this 
activity. 

An  example  specification  is  shown  in  Figure  2.  Such  specifications  are  automatically  generated 
by  the  system  from  information  stored  in  the  data  dictionary  and  are  used  by  application  devel¬ 
opers.  The  first  three  operations  (machina_name,  machine. avail  able,  and  machine  _Labuser)  are 
selectors;  they  take  objects  of  type  MACHINE  and  return  an  object  corresponding  to  the  desired 
property  or  attribute.  The  next  two  (modify .machine-available  and  modify_machine_l abuser) 
provide  a  mechanism  for  mutating  or  modifying  objects.  Entities  of  type  MACHINE  are  created  by 
constructor  createjuachine;  this  operator  takes  a  name,  labuser,  available  triple  and  returns  a 
newly-created  machine.  Finally,  generators  null_MACHINE  and  machine.it  er  at  or  produce  a  null 
machine  and  machine  iterator,  respectively. 

Another  example  from  the  CAFE  system  is  the  type  WAFER,  with  components:  a  TEXT  id,  a 
wafer  description  (WAFERDESC,  which  records  dopant,  crystal  orientation,  sheet  resistivity,  etc.), 
and  a  BOOLEAN  flag  indicating  whether  or  not  a  layer  of  epitaxial  silicon  is  present. 

3.2  Procedural  Abstractions 

GESTALT’s  procedural  abstractions  include  generic  operators  for  LISTs  and  NTUPLEs,  as  well  as 
common  computational  processes  (such  as  maps  and  filters)  packaged  as  higher-order  procedures. 
For  example,  the  following  Lisp  code  prints  all  wafers  from  a  named  lot  (a  logical  set  of  wafers) 
with  resistivity  values  below  a  certain  threshold: 

(map  (lambda  (w)  (wafer-print  w  t)) 

(filter  (lot-wafers  (lot-with-name  name)) 

’low-resistivity?)) 

where  name  identifies  the  desired  lot.  Note  that  map  and  filter  take  arbitrary  procedures  as 
arguments.  Applications  can  also  make  use  of  the  procedure  mechanism  of  the  host  language  to 
create  new  procedural  abstractions. 

In  addition,  a  variety  of  others  procedures  are  supported,  e.g.,  to  copy  and  (extensionally) 
compare  data  objects,  a  simple  mark-and-release  memory  management  scheme,  etc. 


3.3  Iterators 

GESTALT  iterators  offer  a  convenient  and  space-efficient  way  to  examine  instances  of  a  particular 
type,  and  are  inspired  by  CLU  [10].  Iterators  serve  the  function  of  “retrieve  loops”  or  “record 
cursors”  found  in  conventional  embedded  query  languages.  For  example,  the  following  fragment  of 
C  code  illustrates  how  an  application  might  operate  on  all  machines: 
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Overview 


A  NACHIVE  is  a  pises  of  equipment  ussd  in  tbs  fabrication  of  integrated 
circuits.  Attributes  nans,  labuser,  and  available  are  maintained  for  each 
machine,  recording  the  machine’s  name,  a  list  of  labusers  waiting  to  use  the 
machine,  and  the  current  availability,  respectively.  Only  the  latter  two  are 
mutable. 

Operations 

TEXT  aachineuiaae(m) 

NACEIVE  a; 

effects  Returns  machine  name. 

BOOLEAI  machine  .available  (m) 

MACH I IE  m; 

effects  Returns  machine  availability. 

LIST  machine_labuser(m) 

MACHIIE  m; 

effects  Returns  list  of  labusers  waiting  to  use  machine. 

MACHIIE  modifyjnachine.available(m,  available) 

MACHIIE  m;  BOOLEAI  available; 
modifies  m. 

effects  Sets  availability  of  m  to  available. 

MACHIIE  modifyjnachine_labuser(m,  labuser) 

MACHIIE  a;  LIST  labuser; 
modifies  m. 

effects  Sets  labuser  list  of  m  to  labuser. 

MACHIIE  create .machine (name,  available,  labuser) 

TEXT  name;  BOOLEAI  available;  LIST  labuser; 
effects  Returns  newly  created  machine, 
requires  lame  attribute  must  be  unique. 

MACHIIE  null Jf ACHIIE ( ) 

effects  Generate  null  machine. 

ITERATOR  machine-iterator () 

effects  Returns  a  machine  iterator. 

Figure  2:  MACHINE  data  abstraction. 


while  (  BOOLEAMtoiC  iter_more(  machines  )))  { 
currant .machine  *  itar_current<  machines  ) ; 

. . .  computation  involving  current.machine  ...  • 

>  \ 

where  machines  is  an  iterator,  and  iter jnore  and  iter.current  test  for  an  exhausted  iterator 
and  return  the  head  of  an  iterator,  respectively. 

I  3.4  Type  and  Exception  Checking  “  ^ 

All  GESTALT  operators  perform  dynamic  type  checking.  While  static  type  checking  could  (in  prin¬ 
ciple)  be  performed,  an  unobtrusive  implementation  would  require  modifications  of  the  C  compiler 
and  Lisp  interpreter,  which  we  wished  to  avoid. 

I  Certain  operators  also  perform  null  checks  at  runtime.  Generally  speaking,  operators  that  ■ 

examine  object  components  are  strict  with  respect  to  null  objects,  whereas  operators  which  do  not 
examine  components  are  nonstrict.  For  example,  a  null  wafer  can  be  freely  included  in  a  list  or 
tuple;  a  null  exception  is  detected  only  when  an  attempt  is  made  to,  say,  select  the  id  component. 

Routines  that  detect  type  or  null  errors  generate  a  run-time  warning  message  and  return  a  null  , 

1  object  consistent  with  their  range  type.  If  an  attempt  is  made  to  coerce  a  null  object  to  a  host 

language  type,  then  a  pre-defined  value  is  returned,  after  printing  a  suitable  warning  message. 

In  the  case  of  domain-specific  abstractions,  the  DBA  is  free  to  augment  the  run-time  checking 
with  additional  integrity  constraints  (or  invariants  in  programming  language  parlance.) 

|  ■  ....  i 

3.5  Object  Persistence 

In  GESTALT,  all  objects  of  a  user-defined  type  persist,  whereas  objects  of  atomic,  list  and  tuple 
types  are  ephemeral  (unless  they  are  part  of  a  persistent  type).  For  example,  the  constructor 
I  createjnachine  creates  a  persistent  machine  object,  whereas  itoINTEGER  -  which  coerces  a  C  | 

integer  value  to  a  GESTALT  INTEGER-  returns  a  heap-based  object  whose  lifetime  is  bounded  by 
the  duration  of  the  enclosing  program. 

Associated  with  each  user-defined  type  is  a  single  persistent  extent.  The  system  automatically 
updates  these  extents  in  response  to  creation  and  deletion  of  objects  of  the  appropriate  type. 

I  Applications  can  easily  examine  the  contents  of  an  extent  via  an  iterator.  I 

3.6  A  Sample  Application 

As  an  illustration,  we  present  a  simple  C  application.  The  program,  which  utilizes  the  data  abstrac- 
I  tion  of  Figure  2,  informs  the  next  laboratory  user  waiting  to  use  a  particular  piece  of  processing  I 

equipment  when  he  or  she  is  free  to  do  so.  The  code  is  shown  in  Figure  3;  for  brevity,  it  assumes 
(1)  there  are  always  waiting  users  for  a  machine,  and  (2)  a  procedure  send_mail_msg,  which  does 
the  obvious  thing. 

The  application  constructs  a  machine  iterator  (machines)  and  then  uses  it  to  examine  the 
|  availability  of  each  machine.  If  a  machine  is  available,  a  message  is  sent  to  the  next  labuser,  and  1 

the  availability  and  waiting  list  are  updated.  Finally,  the  iterator  is  deallocated. 
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♦include  "specification. h" 

nainO 

{ 

ITERATOR  machines  =  machine.iteratorO ; 

MACH I IE  current .machine; 

while  (  B00LEAItoi(  iter_more(  machines  ))) 

•C 

current .machine  =  iter.cnrrent(  machines  ); 

/*  grab  machine  at  head  of  iterator  */ 

if  (  BOOLEAHtoiC  machine .available (  current .machine  ))) 

LIST  labnsers  =  machine_labuser(  current .machine  ); 

s end. mail _ms g (  machine.nameC  current.machine  ),  head(  labusers  )); 
/*  inform  next  user  •/ 

modify_machine_available(  current.machine,  itoBOOLEAK  0  )); 
modif y _mach ine.labus er (  current.machine,  tail(  labusers  )); 

/*  modify  availability  and  labuser  list  */ 

> 

> 

iter_free(  machines  ) ; 

> 


Figure  3:  Sample  C  application  program. 


Notice  that  nowhere  is  the  programmer  forced  to  deal  with  representational  issues;  one  need 
only  concentrate  on  the  logic  of  the  application  at  hand. 

Note  also  that  one  does  not  have  to  annotate  or  flag  the  database  commands  in  a  program 
( e.g the  •#  of  embedded  QUEL  and  embedded  SQL’s  EXEC  (6].)  The  programming  language 
and  database  have  been  integrated  into  a  single  framework,  so  that  the  actual  encoding  of  a  task 
closely  matches  the  corresponding  abstract  computational  process,  resulting  in  programs  that  are 
perspicuous  and  easily  modifiable. 

Finally,  since  programs  have  no  way  of  determining  where  data  actually  reside,  the  DBA  is  free 
to  change  the  underlying  database  systems  without  compromising  the  correctness  of  application 
programs.  (They  will  have  to  be  relinked,  however.) 

4  The  Implementation  of  GESTALT 

Our  implementation  relies  heavily  on  the  proven  software  engineering  principles  of  abstraction  and 
modularization.  First,  such  techniques  constitute  a  programming  methodology  which  is  effective 
at  controlling  the  complexities  inherent  in  any  large  programming  effort.  Second,  since  the  DBA 
was  expected  to  modify  the  implementation  (e.g.,  to  add  a  new  component  database)  a  clear, 
well-partitioned  internal  structure  was  deemed  essential. 

In  this  section  we  present  the  internal  structure  of  GESTALT,  including  a  discussion  of  the 
steps  required  to  modify  the  system.  We  conclude  by  describing  the  system  configuration  currently 
in  use  by  CAFE. 

4.1  System  Architecture 

The  organization  of  GESTALT  (Figure  4)  is  similar  to  that  of  Multibase  [15].  The  schema  ar¬ 
chitecture  consists  of  a  GESTALT  global  schema,  and  a  GESTALT  local  schema-local  DBMS 
schema  pair  for  each  component  system.  As  in  Multibase,  the  latter  insulates  the  global  system 
from  local  database  details,  allowing  it  to  be  structured  in  a  clean  and  extensible  manner.  All 
component-specific  details  are  confined  to  translator  modules  (one  per  underlying  database)  which 
are  responsible  for  mapping  operations  on  GESTALT  local  schemas  to  local  database  operations. 

GESTALT,  however,  is  responsible  for  translating  global  requests  into  operations  (in  terms  of 
GESTALT  local  schemas)  on  one  or  more  of  the  underlying  databases.  This  translation  is  performed 
by  the  GESTALT  evaluator,  which  is  coded  in  a  manner  independent  of  the  number  and  nature  of 
the  underlying  database  systems.  All  such  dependencies  are  recorded  in  a  dispatch  table,  enabling 
the  DBA  to  extend  the  system  in  a  straightforward  way.  This  table-driven  approach  is  possible 
because  the  evaluator  assumes  a  standard  interface  to  each  of  the  underlying  systems. 

The  system  is  made  available  to  applications  in  the  form  of  a  library  of  procedures.  This  library 
is  either  linked  into  compiled  application  programs  (e.g.,  in  C),  or  made  available  to  an  interpreter 
(e.g.,  in  Lisp). 

4.2  Modifying  the  System 

Due  to  the  modular  implementation,  modifying  the  system  is  relatively  straightforward.  Modifi¬ 
cations  are  required  when  the  underlying  database  configuration  changes  or  when  abstractions  are 
added,  deleted,  or  modified. 
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Adding  a  database  to  the  system  requires  updating  the  dispatch  table  used  by  the  evaluator, 
recording  the  new  database  in  the  data  dictionary,  and  incorporating  the  corresponding  implemen¬ 
tation  module  into  the  system. 

Altering  the  set  of  procedural  abstractions  supported  by  the  system  simply  entails  adding, 
deleting,  or  modifying  the  appropriate  library  routine.  When  updating  data  abstractions,  however, 
things  are  a  bit  more  involved.  In  the  simplest  case,  modifications  of  existing  data  abstractions  only 
involve  source-level  changes  to  routines  in  the  corresponding  abstraction  module  (e.g.,  performing 
additional  integrity  constraint  checks  prior  to  invoking  the  evaluator).  Modifications  that  axe 
tantamount  to  changing  the  GESTALT  schema  require  adding  or  deleting  abstraction  operators, 
entering  new  or  updated  information  into  the  data  dictionary  and  (possibly)  changing  the  schema 
at  one  or  more  of  the  underlying  database  systems. 

As  an  illustration,  consider  the  steps  required  to  introduce  type  LABUSER,  which  captures  the 
notion  of  a  certified  laboratory  user.  The  first  step  is  to  add  the  approriate  type,  attribute,  and 
operator  information  to  the  data  dictionary.  Secondly,  suitable  logical  structures  must  be  created 
at  one  or  more  of  the  underlying  databases.  Finally,  a  LABUSER  specification  and  implementation 
module  are  generated  (based  on  information  in  the  data  dictionary)  and  incorporated  into  the 
system. 
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4.3  Sample  Configuration,  and  Experience 

GESTALT  has  been  the  database  manager  of  the  CAFE  system  since  January,  1987.  Since  then, 
approximately  50  programs  totalling  some  30,000  lines  of  GESTALT  code  have  been  written  by 
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several  applications  programmers.  An  example  is  an  electronic  machine  reservation  system  (EMR) 
which  provides  laboratory  users  with  a  convenient  mechanism  for  reserving  time  on  processing 
machines.  EMR  eliminates  paper  sign-up  sheets  from  the  clean-room,  a  potential  source  of  con¬ 
taminating  particles.  The  system  is  regularly  used  in  the  microtechnology  laboratories  at  MIT. 

An  early  database  configuration  used  by  CAFE  had  GESTALT  running  atop  three  component 
databases:  University  INGRES  [17],  PRELUDE  [13],  and  a  home-brewed  system.  Each  of  these 
databases  made  an  important  contribution  to  the  overall  system.  INGRES  provided  a  reliable, 
fully-functional  data  manager.  However,  due  to  its  lengthy  startup  time,  it  was  unable  to  support 
applications  requiring  fast,  simple  data  access.  To  remedy  this,  PRELUDE —  a  lightweight  ASCII- 
based  DBMS —  was  incorporated.  Finally,  the  home-brewed  system  was  employed  to  store  the 
large,  variable-length  data  objects  commonly  found  in  the  IC  manufacturing  domain.  Storing  such 
objects  in  either  of  the  other  systems  would  have  been  difficult  or  impossible. 

The  current  configuration  utilizes  only  two  databases:  commercial  INGRES  from  Relational 
Technology,  Inc.[9]  and  WiSS,  the  Wisconsin  Storage  System  [5].  Most  CAFE  data  are  stored 
in  INGRES;  WiSS,  because  of  its  support  for  variable- length  data  items,  handles  those  objects 
not  easily  captured  by  the  relational  data  model.  For  example,  WiSS  is  used  to  store  voluminous 
execution  “traces”  of  program-like  process  flow  descriptions. 

Despite  the  wholesale  database  changes  to  move  to  the  current  configuration,  application  pro¬ 
grams  remained  unchanged —  they  only  needed  relinking.  Thus  systems  using  GESTALT  are  not 
locked  in  to  a  particular  database  system;  they  are  free  to  incorporate  newer,  more  powerful  data 
managers  as  they  become  available. 

CAFE  developers  may  still  write  applications  using  INGRES  or  WiSS  directly,  e.g.,  to  take 
advantage  of  INGRES’  report-writer  tools. 

5  Conclusions  and  Future  Work 

We  have  described  GESTALT,  a  system  which  offers  an  expressive  application  programming  paradigm 
in  which  the  database  and  programming  language  have  been  integrated  into  a  single  framework. 
The  system  provides  uniform  access  to  existing  heterogeneous  databases. 

While  GESTALT  offers  a  number  of  benefits,  there  are  some  difficulties  and  limitations  inherent 
in  its  approach.  One  obvious  limitation  is  the  lack  of  a  dynamic  data  definition  capability.  Because 
GESTALT  does  not  have  control  over  the  entire  programming  environment,  this  is  simply  not 
feasible  in  the  system.  In  general,  this  problem  is  very  difficult  in  a  multiple  heterogeneous  database 
environment  (it  is  analogous  to  view  updating  in  conventional  systems). 

Another  limitation  is  the  lack  of  global  concurrency  control  and  recovery.  Although  all  interac¬ 
tions  with  component  databases  are  atomic,  the  system  contains  no  general  mechanism  for  grouping 
several  GESTALT  operations  together  into  a  transaction. 

GESTALT  does  no  query  optimization  of  its  own,  thus  certain  kinds  of  queries  execute  ineffi¬ 
ciently.  For  example,  tasks  implementing  the  equivalent  of  a  multiway  join  are  not  supported  well, 
unless  there  is  a  data  abstraction  that  executes  it  entirely  within  a  component  database  system. 

Our  experience  to  date  with  the  system  has  been  encouraging:  CAFE  application  developers 
have  responded  very  positively.  Despite  the  wide  range  of  programmer  experience  (from  novices 
to  veteran  software  engineers)  all  have  commented  on  how  easy  the  model  was  to  understand,  and 
how  quickly  they  were  able  to  produce  sophisticated,  working  applications. 
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An  immediate  need  that  we  plan  to  address  soon  is  a  coherent  access  control  strategy.  Currently 
applications  using  GESTALT  must  rely  on  component  database  and/or  operating  system  protection 
mechanisms. 

Our  longer-term  plans  involve  extending  GESTALT  so  that  it  is  a  computationally  complete, 
stand-alone  system,  incorporating  its  own  persistent  objects.  We  are  currently  exloring  a  functional 
database  programming  language  with  an  immutable  or  functional  database  [11].  We  feel  that  this 
combination  offers  an  expressive  system  for  high  level  applications  programming,  admits  much 
parallelism  (for  high  performance),  and  facilitates  the  management  of  historical  data. 
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